49 research outputs found

    Potentials of Mean Force for Protein Structure Prediction Vindicated, Formalized and Generalized

    Understanding protein structure is of crucial importance in science, medicine and biotechnology. For about two decades, knowledge-based potentials based on pairwise distances -- so-called "potentials of mean force" (PMFs) -- have been center stage in the prediction and design of protein structure and the simulation of protein folding. However, the validity, scope and limitations of these potentials are still vigorously debated and disputed, and the optimal choice of the reference state -- a necessary component of these potentials -- is an unsolved problem. PMFs are loosely justified by analogy to the reversible work theorem in statistical physics, or by a statistical argument based on a likelihood function. Both justifications are insightful but leave many questions unanswered. Here, we show for the first time that PMFs can be seen as approximations to quantities that do have a rigorous probabilistic justification: they naturally arise when probability distributions over different features of proteins need to be combined. We call these quantities reference ratio distributions, deriving from the application of the reference ratio method. This new view is not only of theoretical relevance, but leads to many insights that are of direct practical use: the reference state is uniquely defined and does not require external physical insights; the approach can be generalized beyond pairwise distances to arbitrary features of protein structure; and it becomes clear for which purposes the use of these quantities is justified. We illustrate these insights with two applications, involving the radius of gyration and hydrogen bonding. In the latter case, we also show how the reference ratio method can be iteratively applied to sculpt an energy funnel. Our results considerably increase the understanding and scope of energy functions derived from known biomolecular structures.
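
For orientation, the quantities named in this abstract are commonly written as below. This is a sketch in assumed notation, not a formula quoted from the abstract: r is a pairwise distance, P is the distribution observed in known structures, Q is the reference state, X is the full structure and Y(X) is a coarser feature of it, such as the radius of gyration.

```latex
% Conventional pairwise potential of mean force (assumed notation):
\Delta F(r) = -k_B T \,\ln \frac{P(r)}{Q(r)}

% Reference ratio form: combine a distribution Q(X) over structures
% with a target distribution P(Y) over a feature Y = Y(X),
% where Q(Y) is the distribution of Y implied by Q(X):
P(X) \approx \frac{P\big(Y(X)\big)}{Q\big(Y(X)\big)} \, Q(X)
```

Read this way, the reference state Q(Y) is fixed by the choice of Q(X) and needs no separate physical argument, which is the point the abstract makes about the reference state being uniquely defined.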

    Image-based Search and Retrieval for Biface Artefacts using Features Capturing Archaeologically Significant Characteristics

    Archaeologists are currently producing huge numbers of digitized photographs to record and preserve artefact finds. These images are used to identify and categorize artefacts, reason about connections between artefacts, and perform outreach to the public. However, finding specific types of images within collections remains a major challenge. Often, the metadata associated with images is sparse or inconsistent. This makes keyword-based exploratory search difficult, leaving researchers to rely on serendipity and slowing down the research process. We present an image-based retrieval system that addresses this problem for biface artefacts. In order to identify artefact characteristics that need to be captured by image features, we conducted a contextual inquiry study with experts in bifaces. We then devised several descriptors for matching images of bifaces with similar artefacts. We evaluated the performance of these descriptors using measures that specifically look at the differences between the sets of images returned by the search system using different descriptors. Through this nuanced approach, we have provided a comprehensive analysis of the strengths and weaknesses of the different descriptors and identified implications for the design of search systems for archaeology.

    Regulatory Elements within the Prodomain of Falcipain-2, a Cysteine Protease of the Malaria Parasite Plasmodium falciparum

    Falcipain-2, a papain family cysteine protease of the malaria parasite Plasmodium falciparum, plays a key role in parasite hydrolysis of hemoglobin and is a potential chemotherapeutic target. As with many proteases, falcipain-2 is synthesized as a zymogen, and the prodomain inhibits activity of the mature enzyme. To investigate the mechanism of regulation of falcipain-2 by its prodomain, we expressed constructs encoding different portions of the prodomain and tested their ability to inhibit recombinant mature falcipain-2. We identified a C-terminal segment (Leu155–Asp243) of the prodomain, including two motifs (ERFNIN and GNFD) that are conserved in cathepsin L sub-family papain family proteases, as the mediator of prodomain inhibitory activity. Circular dichroism analysis showed that the prodomain including the C-terminal segment, but not constructs lacking this segment, was rich in secondary structure, suggesting that the segment plays a crucial role in protein folding. The falcipain-2 prodomain also efficiently inhibited other papain family proteases, including cathepsin K, cathepsin L, cathepsin B, and cruzain, but it did not inhibit cathepsin C or tested proteases of other classes. A structural model of pro-falcipain-2 was constructed by homology modeling based on crystallographic structures of mature falcipain-2, procathepsin K, procathepsin L, and procaricain, offering insights into the nature of the interaction between the prodomain and mature domain of falcipain-2 as well as into the broad specificity of inhibitory activity of the falcipain-2 prodomain

    Using neural networks and evolutionary information in decoy discrimination for protein tertiary structure prediction

    Background: We present a novel method of protein fold decoy discrimination using machine learning, more specifically using neural networks. Here, decoy discrimination is represented as a machine learning problem, where neural networks are used to learn the native-like features of protein structures using a set of positive and negative training examples. A set of native protein structures provides the positive training examples, while negative training examples are simulated decoy structures obtained by reversing the sequences of native structures. Various features are extracted from the training dataset of positive and negative examples and used as inputs to the neural networks. Results: The best-performing neural network is the one that uses input information comprising PSI-BLAST [1] profiles of residue pairs, pairwise distances and the relative solvent accessibilities of the residues. This neural network is the best among all methods tested in discriminating the native structure from a set of decoys for all decoy datasets tested. Conclusion: This method is demonstrated to be viable, and furthermore evolutionary information is successfully used in the neural networks to improve decoy discrimination.

    Segmentation of epidermal tissue with histopathological damage in images of haematoxylin and eosin stained human skin.

    Background: Digital image analysis has the potential to address issues surrounding traditional histological techniques including a lack of objectivity and high variability, through the application of quantitative analysis. A key initial step in image analysis is the identification of regions of interest. A widely applied methodology is that of segmentation. This paper proposes the application of image analysis techniques to segment skin tissue with varying degrees of histopathological damage. The segmentation of human tissue is challenging as a consequence of the complexity of the tissue structures and inconsistencies in tissue preparation, hence there is a need for a new robust method with the capability to handle the additional challenges materialising from histopathological damage. Methods: A new algorithm has been developed which combines enhanced colour information, created following a transformation to the L*a*b* colourspace, with general image intensity information. A colour normalisation step is included to enhance the algorithm's robustness to variations in the lighting and staining of the input images. The resulting optimised image is subjected to thresholding and the segmentation is fine-tuned using a combination of morphological processing and object classification rules. The segmentation algorithm was tested on 40 digital images of haematoxylin & eosin (H&E) stained skin biopsies. Accuracy, sensitivity and specificity of the algorithmic procedure were assessed through the comparison of the proposed methodology against manual methods. Results: Experimental results show the proposed fully automated methodology segments the epidermis with a mean specificity of 97.7%, a mean sensitivity of 89.4% and a mean accuracy of 96.5%. When a simple user interaction step is included, the specificity increases to 98.0%, the sensitivity to 91.0% and the accuracy to 96.8%. The algorithm segments effectively for different severities of tissue damage. Conclusions: Epidermal segmentation is a crucial first step in a range of applications including melanoma detection and the assessment of histopathological damage in skin. The proposed methodology is able to segment the epidermis with different levels of histological damage. The basic method framework could be applied to segmentation of other epithelial tissues.
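
The three figures reported above are standard pixel-wise comparisons between an automatic mask and a manually drawn one. The sketch below shows how they might be computed. The function name `segmentation_metrics` and the toy masks are hypothetical, and this is not the authors' evaluation code.

```python
def segmentation_metrics(predicted, reference):
    """Compare two same-sized binary masks (nested lists of 0/1) pixel by pixel.

    `predicted` stands in for an automatic segmentation and `reference`
    for a manual one; both names are illustrative assumptions.
    """
    tp = fp = tn = fn = 0
    for pred_row, ref_row in zip(predicted, reference):
        for p, r in zip(pred_row, ref_row):
            if p and r:
                tp += 1          # tissue pixel correctly labelled
            elif p and not r:
                fp += 1          # background wrongly labelled as tissue
            elif r:
                fn += 1          # tissue pixel missed
            else:
                tn += 1          # background correctly left out

    total = tp + fp + tn + fn
    return {
        # Fraction of reference tissue pixels that were found.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Fraction of reference background pixels that were left out.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        # Fraction of all pixels labelled correctly.
        "accuracy": (tp + tn) / total if total else float("nan"),
    }


# Toy 2x4 masks, purely illustrative.
predicted = [[1, 1, 0, 0],
             [1, 0, 0, 0]]
reference = [[1, 1, 1, 0],
             [0, 0, 0, 0]]
metrics = segmentation_metrics(predicted, reference)
```

Because background pixels usually far outnumber epidermis pixels in a biopsy image, specificity and accuracy tend to sit close together, while sensitivity is the figure that most directly reflects missed tissue.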

    Local Alignment Refinement Using Structural Assessment

    Homology modeling is the most commonly used technique to build a three-dimensional model for a protein sequence. It relies heavily on the quality of the sequence alignment between the protein to be modeled and related proteins with a known three-dimensional structure. Alignment quality can be assessed according to the physico-chemical properties of the three-dimensional models it produces.

    Structural Annotation of Mycobacterium tuberculosis Proteome

    Of the ∼4000 ORFs identified through the genome sequence of Mycobacterium tuberculosis (TB) H37Rv, experimentally determined structures are available for 312. Since knowledge of protein structures is essential to obtain a high-resolution understanding of the underlying biology, we seek to obtain a structural annotation for the genome, using computational methods. Structural models were obtained and validated for ∼2877 ORFs, covering ∼70% of the genome. Functional annotation of each protein was based on fold-based functional assignments and a novel binding site based ligand association. New algorithms for binding site detection and genome scale binding site comparison at the structural level, recently reported from the laboratory, were utilized. Besides these, the annotation covers detection of various sequence and sub-structural motifs and quaternary structure predictions based on the corresponding templates. The study provides an opportunity to obtain a global perspective of the fold distribution in the genome. The annotation indicates that cellular metabolism can be achieved with only 219 folds. New insights about the folds that predominate in the genome, as well as the fold-combinations that make up multi-domain proteins are also obtained. 1728 binding pockets have been associated with ligands through binding site identification and sub-structure similarity analyses. The resource (http://proline.physics.iisc.ernet.in/Tbstructuralannotation), being one of the first to be based on structure-derived functional annotations at a genome scale, is expected to be useful for better understanding of TB and for application in drug discovery. The reported annotation pipeline is fairly generic and can be applied to other genomes as well

    A Kernel for Open Source Drug Discovery in Tropical Diseases

    Open source drug discovery, a promising alternative avenue to conventional patent-based drug development, has so far remained elusive with few exceptions. A major stumbling block has been the absence of a critical mass of preexisting work that volunteers can improve through a series of granular contributions. This paper introduces the results from a newly assembled computational pipeline for identifying protein targets for drug discovery in ten organisms that cause tropical diseases. We have also experimentally tested two promising targets for their binding to commercially available drugs, validating one and invalidating the other. The resulting kernel provides a base of drug targets and lead candidates around which an open source community can nucleate. We invite readers to donate their judgment and in silico and in vitro experiments to develop these targets to the point where drug optimization can begin

    Trends in template/fragment-free protein structure prediction

    Predicting the structure of a protein from its amino acid sequence is a long-standing unsolved problem in computational biology. Its solution would be of both fundamental and practical importance as the gap between the number of known sequences and the number of experimentally solved structures widens rapidly. Currently, the most successful approaches are based on fragment/template reassembly. The lack of progress in template-free structure prediction calls for novel ideas and approaches. This article reviews trends in the development of physical and specific knowledge-based energy functions as well as sampling techniques for fragment-free structure prediction. Recent physical- and knowledge-based studies demonstrated that it is possible to sample and predict highly accurate protein structures without borrowing native fragments from known protein structures. These emerging approaches with fully flexible sampling have the potential to move the field forward.